1. INTRODUCTION

The key responsibility of the electricity industry is to supply sufficient and uninterrupted electric
power to consumers at different locations. Inadequate supply is one of the major problems faced by both
consumers and utility companies, and it results in hardship and a high cost of living. The problem stems
from inadequate generation, transmission, and distribution network planning, which can be addressed with
an appropriate forecasting technique.

Forecasting, as a process of estimating future events and planning the activities of utility companies,
plays a vital role in the development of the power sector. Besides helping to reduce generating costs,
it is important for ensuring the sustainability and reliability of power systems. According to Tan et al. [1],
load demand forecasting is one of the methods used to keep the balance between power supply and demand, and
it must be highly accurate to guarantee efficiency and stability. Corrective actions to maintain this balance
have resulted in numerous load distribution practices, such as load shedding, adopted by electricity utility
companies. Economically important utility functions such as security analysis, planning of power
development, maintenance, scheduling, and dispatching all depend on accurate load forecasting. Accurate
demand forecasting helps an electric utility company to make important decisions, including decisions on
network planning, purchase, tariff structure planning, power distribution, load switching, and
infrastructural development. The major reason for planning by utility companies is to ensure that future
demand is met and that the supply of electricity to consumers is adequate and reliable. A reliable model is
required to achieve this objective. Zhang et al. [2] suggested that hybrid modeling approaches can be applied
to electricity forecasting. Load demand forecasting is categorized by time horizon into three classes:
i) short-term, ii) medium-term, and iii) long-term. Short-term forecasts usually cover one hour to one week,
medium-term forecasts cover a few weeks to a few months and even up to a year, while long-term forecasts
cover one year and above. Since the deregulation of the electricity industry, short-term load demand
forecasting has been both very important and challenging. Forecasts for different time periods are important
for different operation, protection, and control tasks within the power system. Huang et al. [3] successfully
applied a time-series and filtering method.

Several demand forecasting methods and models are used for short-term forecasting. These methods
include curve fitting, trendline, regression, Gaussian process methods, time series, and artificial
intelligence. The most popular and commonly applied time-series forecasting models fall into three groups:
statistical, machine learning, and hybrid. Su et al. [4] used machine learning models to forecast short-term
and mid-term loads, with mean absolute error (MAE) as the performance metric. The underlying premise of
time-series approaches is that the data are internally structured, for example through autocorrelation,
trend, or seasonal variation.

Atmospheric conditions (weather) vary over short periods at a specific location, and the climate also
varies with time, seasonally and annually. Therefore, the relationship between demand and climatic
(temperature) change is nonlinear. In [5], regression was applied with great success to predict wind and
solar radiation over a short time horizon. Load demand forecasting is based on historical load growth
patterns as well as the rate of industrial expansion; this approach was presented in [6]. Understanding how
historical data or time series behaved in the past requires knowledge of the pattern in the data. If the
behavior is expected to persist into the future, the past pattern can be used as a reference for choosing
the forecasting method. The stability of the data pattern and the forecasting horizon both affect the
accuracy of these methods. Pattern-based linear regression models were adopted in [7]. The results and
application of forecasts depend on the time at which the forecast is carried out.

This study primarily aimed to identify the best trend and curve-fitting forecasting method for
weekly peak load demand. Time-series and curve-fitting techniques, namely exponential smoothing, moving
average, quadratic trend, logarithmic trend, and hybrid (quadratic-logarithmic) trend, were modelled to
augment the existing techniques. These techniques are unambiguous and inexpensive, and they give accurate
results. The appropriate method is selected using the step-by-step procedure shown in Figure 1. Combining
the forecasts from two different models gave the best performance, because the hybridization exploits the
advantages of each model to produce a better one. This is confirmed by the results presented in Table 1.

Performance is evaluated using several statistical metrics. The most widely applied are mean
absolute percentage error (MAPE), mean absolute error (MAE), mean square error (MSE), and root-mean-square
error (RMSE). In addition, the cubic-root mean error (CRME) is introduced as a performance metric in this
study to enhance the measurement of forecast error. The calculated CRME values are lower than the RMSE
values on the same dataset, which is interpreted here as a better performance estimate and as an indication
that CRME is a valid accuracy measure.

The paper is organized as follows: section 1 is the introduction; section 2 reviews the literature;
section 3 explains the methodology; section 4 presents the results and discussion; and section 5 states
the conclusion.


2. STATE OF THE ART

Despite the extensive literature on short-term electricity load demand forecasting, research in this
area remains challenging for electrical engineering scholars because of its high complexity. In the
investigation by Hammad et al. [8], the fundamental forecasting techniques are divided into two categories:
qualitative and quantitative. The selection of the appropriate type depends mainly on data availability.
The methods are classified by the degree of mathematical analysis used in the forecasting model. The Delphi
technique, curve fitting, and technology comparisons are examples of qualitative procedures, which rely on
expert judgment and are typically used when historical data are insufficient or nonexistent. The
quantitative methods include regression analysis, decomposition methods, exponential smoothing, and
Box-Jenkins methods, all of which are based on mathematical and statistical computation. The mathematical
models of these techniques are discussed in [9]. Short-term time-series analysis using regression was
studied by Reddy and Vishali [10]. Kasule et al. [11] used a curve-fitting approach for energy supply and
demand planning based on energy consumption data, with performance measured in terms of MAPE and RMSE.
Kuster et al. [12], in their review of electrical load forecasting models, note that despite their
simplicity, regression methods are still in common use for long-term and short-term forecasting. Other
statistical methods, such as time-series analysis, are also used. Farahat and Talaat [13] presented a
curve-fitting prediction approach for short-term load forecasting in which the optimal parameters of a
Gaussian model are obtained using a genetic algorithm that takes the error between actual and forecasted
load as the cost function. Ismail et al. [14] reported the use of linear regression and polynomial curve
fitting to forecast wind and solar power production. Time-series statistical approaches are conventionally
defined by the mathematical relationship between one or more random variables and other non-random
variables. Li et al. [15] developed an improved short-term forecasting method based on the ensemble
empirical mode decomposition algorithm and multivariable linear regression.

Zhang [16] investigated the use of singular spectrum analysis to facilitate the construction of
decomposition-based forecasting approaches for electricity load. Kabak and Taha [17] reported the
development of a day-ahead price forecast using an artificial neural network (ANN). Using information from
previous data, Mi et al. [18] applied an exponential smoothing grey model to predict the next-day
electricity load demand. In [19], results of nonfunctional parametric predictions for next-day forecasts
were presented and compared with a seasonal autoregressive integrated moving average (ARIMA) model.
Vu et al. [20] reported two data-adjustment techniques aimed at creating cleaner information by mitigating
abnormal data points caused by daylight saving time and holidays. The results showed an improvement over
autoregressive and ANN-based models. To eliminate seasonal variations in historical data, Lu et al. [21]
used grey theory, a moving average model, and its diagnosis stabilization approach to perform power system
forecasting. Wang et al. [22] developed a hybrid method that combines multiple regression with bootstrap
processing of previous data to forecast electricity demand in China, with good results.


3. METHODOLOGY 

The flowchart in Figure 1 is used as a guide for constructing the functional models for this study.
The plot of the average weekly load data obtained from [23], shown in Figure 2, was used to identify the
existing pattern in the historical data. Time-series trendline methods are employed because the future
behavior of the data is related to its past behavior. Following the steps listed in Figure 1, the model
equation of each method is obtained. The actual data values and time periods are substituted into each
model equation to obtain the forecast values. The forecast accuracy of each method is then tested using the
performance metrics MAPE, RMSE, and the newly introduced CRME. The process continues until all the methods
have been tested. In addition, two of the model equations were combined to obtain a hybrid model. The
performance results of the methods are compared, and the method with the lowest error values is chosen as
the best forecasting method.

Figure 2 is the plot of the average weekly load used for this study. The average peak load (MW) is
shown on the vertical axis and the time (weeks) on the horizontal axis. The plot shows two curves: i) the
trendline, which increases gradually with time, and ii) the repeating load curve. The variability of the
curve shows the pattern and behavior of the time series. Based on these observations, forecasting methods
appropriate to the existing pattern were applied.


3.1. Input data analysis 

The data for this study were collected from Enugu Electricity Distribution Company (EEDC) for the
period 1st January 2021 to 31st December 2021 [23], and contained exogenous and endogenous variables. The
structure and components of the input data were developed through four sequential tasks: i) data
pre-processing, ii) correlation analysis, iii) entropy analysis, and iv) autocorrelation analysis, which
are briefly explained below.


3.1.1. Correlation analysis 

The correlation analysis was carried out to identify the influence of the independent (exogenous and
endogenous) variables on the load data. The weekly average association between temperature and peak load
was examined with varied time shifts, and the outcome showed low correlation coefficient values. Based on
this analysis, the average weekly peak load, an endogenous variable, is the only factor used to determine
the composition of the input data.
Figure 1. Flowchart for the development of the models
Figure 2. Plot of weekly peak load and trend line


3.1.2. Entropy analysis

Entropy analysis is a statistical and mathematical method for analyzing time series that places no
constraints on the probability distribution. The input data are created from a number of consecutive
historical values of the forecast variable. The time series was subjected to entropy analysis in order to
assess the impact of increasing the number of contiguous data values in the input data. Shannon entropy
(ShEn) and conditional entropy (ConEn) are the basic measures of information and of the rate of information
generation, respectively, as presented in [24]. The expression for Shannon entropy in (3) is obtained as
follows. Let $X_1, X_2, \ldots, X_n$ be a sequence of discrete random variables taken from a finite source
y, with joint probabilities given by (1).

$$P(X_1 = x_1, \ldots, X_n = x_n) = p(x_1, \ldots, x_n) \tag{1}$$

Then the Shannon entropy of this source is defined as (2).

$$H(X_1, \ldots, X_n) = -\sum_{x_1, \ldots, x_n} p(x_1, \ldots, x_n) \log p(x_1, \ldots, x_n) \tag{2}$$

Calculation in (2) can be simply written as (3).

$$H(X) = -\sum_{n} p_n \log p_n \tag{3}$$


Int J Appl Power Eng, Vol. 13, No. 1, March 2024: 81-90 


Int J Appl Power Eng ISSN: 2252-8792 85

Here H(X) is the Shannon entropy and $p_n$ is the probability of the sequence. The average mutual
information, on the other hand, uses one extra quantity that links the entropy or uncertainty of two random
variables and is defined as (4).


$$I(X;Y) = \sum_{i=0}^{N} \sum_{j=0}^{M} p(x_i, y_j) \log \frac{p(x_i \mid y_j)}{p(x_i)} \tag{4}$$

Equation (4) can be expanded in terms of the entropy and conditional entropy as (5)-(7).

$$I(X;Y) = \sum_{i=0}^{N} \sum_{j=0}^{M} p(x_i \mid y_j)\, p(y_j) \log \frac{p(x_i \mid y_j)}{p(x_i)} \tag{5}$$
$$= \sum_{i=0}^{N} \sum_{j=0}^{M} p(x_i, y_j) \log p(x_i \mid y_j) - \sum_{i=0}^{N} \sum_{j=0}^{M} p(x_i, y_j) \log p(x_i) \tag{6}$$
$$I(X;Y) = H(X) - H(X \mid Y) \tag{7}$$

In (6), the first term equals $-H(X \mid Y)$, the negative of the conditional entropy, and the second term
equals H(X), the entropy [24]. This means that the average mutual information is equal to the source's
entropy less any residual uncertainty about the source's output after the value has been recovered. Here
$H(X \mid Y)$ is the conditional entropy of the two random variables X and Y, N and M are the numbers of
X and Y values (i = 0, 1, ..., N and j = 0, 1, ..., M), and p is the probability of the sequence.
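As an illustration of (3) and (7), the following minimal sketch estimates Shannon entropy, conditional entropy, and average mutual information from observed symbol frequencies. It is not the procedure used in the study: the base-2 logarithm, the function names, and the discretised load levels are all assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """H(X) = -sum(p_n * log2(p_n)), with p_n estimated from symbol counts."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(xs, ys):
    """H(X|Y) = -sum p(x, y) * log2 p(x | y), estimated from paired samples."""
    joint = Counter(zip(xs, ys))
    y_counts = Counter(ys)
    total = len(xs)
    return -sum(
        (c / total) * math.log2(c / y_counts[y]) for (_, y), c in joint.items()
    )


def mutual_information(xs, ys):
    """I(X;Y) = H(X) - H(X|Y), the identity stated in (7)."""
    return shannon_entropy(xs) - conditional_entropy(xs, ys)


# Hypothetical discretised weekly load levels (not the EEDC data).
levels = ["low", "low", "high", "high", "low", "high", "low", "high"]
current = levels[1:]    # X: level in week t
previous = levels[:-1]  # Y: level in week t-1
print(shannon_entropy(current), mutual_information(current, previous))
```

A mutual information close to zero would suggest that adding the previous value contributes little information about the current one.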


3.1.3. Auto-correlation analysis 

To determine the best correlation coefficients, the analysis was performed backwards in the time
series, that is, toward the past. The procedure involved finding the short sequence that best captured the
probable consumption trend, based on how the load behaved during a similar time period in the preceding
days. The autocorrelation analysis in this study reveals a sharp decline in the autocorrelation
coefficients. Based on these analyses, the composition of the input data relies essentially on the
endogenous variable of the time series.
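The autocorrelation coefficients referred to above can be computed with the standard sample estimator. The sketch below assumes that estimator and uses a hypothetical series; the paper does not state which formula or tool was used.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Hypothetical weekly peak loads in MW (not the EEDC data).
weekly_peaks = [2010.0, 2035.0, 1998.0, 2050.0, 2042.0, 2075.0, 2060.0, 2101.0]
coefficients = [autocorrelation(weekly_peaks, lag) for lag in range(1, 4)]
print(coefficients)
```

Plotting such coefficients against the lag shows how quickly the dependence on past values decays.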


3.2. Identification and choice of model 

The pattern exhibited by the time series helps to determine how the data behaved in the past. As
shown in Figure 2, the data exhibit both a repeating pattern and a trend, so models suited to this kind of
pattern are used. The appropriate forecasting models are regression or trendline models, namely
straight-line (linear) trend, moving average, exponential smoothing, quadratic trend, logarithmic trend,
and combined quadratic and logarithmic trend. Each model was tested on the same weekly dataset and the
results were compared.


3.2.1. Linear trend method 

This method is used to model the relationship between load consumption and other factors such as day
type and weather conditions. It assumes that future results will be generally consistent with the
historical data. The general form of the linear trend is given in (8).

$$Y_t = a_0 + b_1 t \tag{8}$$

Here $Y_t$ is the linear trend value in period t, $a_0$ is the intercept of the linear trend line, $b_1$ is
the slope of the linear trend line, and t is the time period. The values of $a_0$ and $b_1$ are determined
as in (9).

$$b_1 = \frac{\sum_{t=1}^{n} (t - \bar{t})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (t - \bar{t})^2} \quad \text{and} \quad a_0 = \bar{Y} - b_1 \bar{t} \tag{9}$$

In (9), the average value of the time period is $\bar{t} = \frac{1}{n}\sum_{t=1}^{n} t$, while the average
value of the time series is $\bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$, with n being the number of time
periods or observations. Using the available data and these formulas, the linear trend model equation is
finally written as (10).

$$Y_t = 2007.78 + 6.935t \tag{10}$$

The performance result based on (10) is shown in Table 1.
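The slope and intercept expressions in (9) translate directly into code. The sketch below is illustrative only: the function names and the sample series are hypothetical, and only the coefficients 2007.78 and 6.935 come from (10).

```python
def fit_linear_trend(y):
    """Return (a0, b1) for Y_t = a0 + b1 * t with t = 1, 2, ..., n, following (9)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    b1 = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    a0 = y_bar - b1 * t_bar
    return a0, b1


def linear_trend(t, a0=2007.78, b1=6.935):
    """Evaluate the fitted model (10) for week t."""
    return a0 + b1 * t


# Hypothetical weekly peak loads in MW (not the EEDC data).
sample = [2012.0, 2021.0, 2030.0, 2034.0, 2045.0, 2049.0]
a0, b1 = fit_linear_trend(sample)
print(a0, b1, linear_trend(53))
```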


Time-series trendline and curve-fitting-based approach to short-term ... (Ifeanyi Benitus Anichebe) 




Table 1. Performance results of the models

Method                            MAPE (%)   RMSE (MW)   CRME (MW)
Linear trend                      5.565      24.4352     15.5608
Three-week moving average         4.500      21.4825     14.4406
Exponential smoothing             6.7813     30.9439     19.9698
Quadratic trend                   4.5402     21.0660     13.5715
Logarithmic trend                 5.4929     24.3490     15.5604
Quadratic and logarithmic trend   4.5240     21.0631     13.5604


3.2.2. Moving average method 

The moving average is a smoothing technique that generates an estimate of future values from the
underlying pattern of a set of data [25]. The forecast for the period under consideration is the average of
the k most recent data values in the time series. Mathematically, a moving average forecast of order k is
given by (11).

$$F_{t+1} = \frac{\sum (\text{most recent } k \text{ data values})}{k} = \frac{Y_t + Y_{t-1} + \cdots + Y_{t-k+1}}{k} \tag{11}$$

Here $F_{t+1}$ is the forecast of the time series for period t+1, $Y_t$ is the actual value of the series
in period t, and $Y_{t-1}$ is the previous value of the series. A three-week moving average (k = 3) was used
to forecast the future time period, and the performance outcome is shown in Table 1.
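A minimal sketch of the moving average forecast in (11) follows. The function names and sample values are hypothetical; the order of three matches the three-week moving average used in the study.

```python
def moving_average_forecast(y, k=3):
    """Forecast for the next period: mean of the k most recent values, as in (11)."""
    if len(y) < k:
        raise ValueError("need at least k observations")
    return sum(y[-k:]) / k


def in_sample_forecasts(y, k=3):
    """One-step-ahead forecasts for periods k+1..n, each using the k prior values."""
    return [sum(y[i - k:i]) / k for i in range(k, len(y))]


# Hypothetical weekly peak loads in MW (not the EEDC data).
sample = [2010.0, 2040.0, 2025.0, 2055.0, 2070.0]
print(in_sample_forecasts(sample), moving_average_forecast(sample))
```

The in-sample forecasts are what would be compared against the actual values when computing the error metrics.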


3.2.3. Exponential smoothing 

This method is a special case of the weighted moving average method in which only one weight is
selected: the weight for the most recent observation [25]. The weights for the other data values are
computed from it and become smaller as the observations move further into the past. The exponential
smoothing model equation is given in (12).

$$F_{t+1} = \alpha Y_t + (1 - \alpha) F_t \tag{12}$$

In (12), $F_{t+1}$ is the forecast value of the time series for period t+1, $Y_t$ is the actual value of the
time series in period t, $F_t$ is the forecast of the time series for period t, and $\alpha$ is the
smoothing constant ($0 < \alpha < 1$). The model equation for forecasting the third week with a smoothing
constant of 0.2 is given as (13), and the performance result of this method is also shown in Table 1.

$$F_3 = 0.2 Y_2 + 0.8 F_2 \tag{13}$$
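The recursion in (12) can be sketched as below. The initialisation $F_2 = Y_1$ is a common textbook convention and is an assumption here, since the paper does not state how the first forecast was set; the sample values are hypothetical.

```python
def exponential_smoothing(y, alpha=0.2):
    """Return forecasts F_2, F_3, ..., F_{n+1} using F_{t+1} = alpha*Y_t + (1-alpha)*F_t."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    forecasts = [y[0]]  # assumed initialisation: F_2 = Y_1
    for actual in y[1:]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts  # forecasts[i] is the forecast for period i + 2


# Hypothetical weekly peak loads in MW (not the EEDC data).
sample = [2000.0, 2050.0, 2030.0]
f2, f3, f4 = exponential_smoothing(sample)
print(f2, f3, f4)
```

With alpha = 0.2, the second element reproduces the structure of (13): 0.2 times the week-two actual plus 0.8 times the week-two forecast.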


3.2.4. Quadratic trend model 

A linear function is commonly used to model trend. Sometimes, however, a time series has a
curvilinear (non-linear) trend. A variety of non-linear functions can be used, including quadratic,
logarithmic, and polynomial functions; the quadratic trend is given in (14).

$$Y = a_0 + a_1 t + a_2 t^2 \tag{14}$$

In (14), Y represents the value of the load (dependent variable), t is the time in weeks, and $a_0$, $a_1$,
and $a_2$ are the constants to be determined. Their values were determined by simultaneously solving the
modified expressions (normal equations) in (15).

$$\sum Y = n a_0 + a_1 \sum t + a_2 \sum t^2$$
$$\sum tY = a_0 \sum t + a_1 \sum t^2 + a_2 \sum t^3 \tag{15}$$
$$\sum t^2 Y = a_0 \sum t^2 + a_1 \sum t^3 + a_2 \sum t^4$$

The computation was carried out in Microsoft Excel. The resulting quadratic trend model is given in (16),
and its performance result is shown in Table 1.

$$Y = 0.93t^2 - 23.58t + 2179 \tag{16}$$
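The study solved the normal equations in Microsoft Excel. As an alternative illustration, the sketch below builds the same three equations and solves them by Gaussian elimination in plain Python; the helper names and the test series are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic_trend(y):
    """Return [a0, a1, a2] for Y = a0 + a1*t + a2*t^2 with t = 1..n, from (15)."""
    t = range(1, len(y) + 1)
    s = [sum(ti ** p for ti in t) for p in range(5)]  # s[p] = sum of t^p, s[0] = n
    rhs = [sum((ti ** p) * yi for ti, yi in zip(t, y)) for p in range(3)]
    matrix = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    return solve_linear_system(matrix, rhs)


# Hypothetical series generated from a known quadratic, to check the fit.
series = [2.0 + 3.0 * t + 0.5 * t * t for t in range(1, 11)]
print(fit_quadratic_trend(series))
```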


3.2.5. Logarithmic trend model

The logarithmic model is another non-linear model that was applied to forecast peak load. It is
given by (17).

$$Y = \operatorname{antilog}(C + dt) \tag{17}$$

In (17), Y represents the dependent variable (load in MW) and t represents the weeks. The values of C and d
were determined by simultaneously solving the modified expressions of (17) presented in (18).

$$\sum \log Y = nC + d \sum t$$
$$\sum t \log Y = C \sum t + d \sum t^2 \tag{18}$$

Substituting the obtained values of C and d into (17) gives the model equation (19). The performance
result of this method is shown in Table 1.

$$Y = \operatorname{antilog}(3.302 + 0.00144t) \tag{19}$$
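The two equations in (18) can be solved in closed form for C and d. The sketch below assumes base-10 logarithms, which is consistent with the constant 3.302 in (19) corresponding to roughly 2,004 MW; the function names and the test series are hypothetical.

```python
import math


def fit_log_trend(y):
    """Return (C, d) for log10(Y) = C + d*t with t = 1..n, by solving (18)."""
    n = len(y)
    t = list(range(1, n + 1))
    log_y = [math.log10(v) for v in y]
    sum_t = sum(t)
    sum_t2 = sum(ti * ti for ti in t)
    sum_ly = sum(log_y)
    sum_tly = sum(ti * li for ti, li in zip(t, log_y))
    d = (n * sum_tly - sum_t * sum_ly) / (n * sum_t2 - sum_t ** 2)
    c = (sum_ly - d * sum_t) / n
    return c, d


def log_trend(t, c=3.302, d=0.00144):
    """Evaluate the fitted model (19) for week t, taking antilog as 10**x."""
    return 10 ** (c + d * t)


# Hypothetical series generated from a known model, to check the fit.
series = [10 ** (3.0 + 0.01 * t) for t in range(1, 11)]
print(fit_log_trend(series), log_trend(1))
```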


3.2.6. Quadratic-logarithmic trend model 

It is possible to combine forecasts to create a composite forecast [26]. For instance, if three
different forecasting methods are applied to the same set of data, a new forecasting model can be developed
by combining them. If the forecast values of the three models are $F_1$, $F_2$, and $F_3$, a composite
forecast can be created as (20).

$$Y = a_0 + a_1 F_1 + a_2 F_2 + a_3 F_3 \tag{20}$$

The quadratic-logarithmic model is the combination of the quadratic trend model (16) and the logarithmic
trend model (19). Combining their forecast results gives the hybrid model equation presented in (21). The
performance result of this method is shown in Table 1.

$$Y_t = 0.93t^2 - 23.58t + 2175.69 \tag{21}$$


3.3. Forecast and accuracy measure 

The best method is chosen according to the values of the performance evaluation metrics. A
performance metric shows how good or bad a method is at forecasting [27]. The method with the lowest value
of a chosen metric performs best in terms of that metric, while the method with the highest value performs
worst. In this work, MAPE, RMSE, and the introduced CRME, defined in (22), were used to test the accuracy
of the approaches. The results obtained by each method on the same dataset are presented in Table 1.

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{A_t - F_t}{A_t} \right| \times 100$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (A_t - F_t)^2} \tag{22}$$
$$\mathrm{CRME} = \sqrt[3]{\frac{1}{N} \sum_{t=1}^{N} (A_t - F_t)^3}$$

In (22), $A_t$ represents the actual value at time t, $F_t$ is the forecasted value at time t, and N is the
number of data points used.
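The three metrics can be sketched as follows. The CRME function follows one reading of (22), in which the signed errors are cubed before averaging; that reading, the function names, and the sample values are assumptions. Under this reading positive and negative errors can offset each other, which is one reason a CRME value may come out smaller than the RMSE on the same data.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))


def rmse(actual, forecast):
    """Root-mean-square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def crme(actual, forecast):
    """Cubic-root mean error: real cube root of the mean signed cubed error."""
    n = len(actual)
    mean_cubed = sum((a - f) ** 3 for a, f in zip(actual, forecast)) / n
    # copysign keeps the real cube root when the mean cubed error is negative
    return math.copysign(abs(mean_cubed) ** (1.0 / 3.0), mean_cubed)


# Hypothetical actual and forecast values in MW (not the EEDC data).
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast), rmse(actual, forecast), crme(actual, forecast))
```

In this small example the errors are -10 and +10, so the RMSE is 10 while the signed cubes cancel.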


4. RESULTS AND DISCUSSION 

The performance of each model in forecasting week-ahead electricity demand on the same dataset was
evaluated using MAPE, RMSE, and CRME. The values were calculated using (22), and the results are presented
in Table 1. The percentage share of each method in the total of each metric was also calculated and used in
deciding the best-performing method. The method that produces the highest value of an evaluation metric
performs worst, while the method with the lowest value performs best. The exponential smoothing method
(row three) has 6.7813, 30.9439 MW, and 19.9698 MW, which represent 21.60%, 21.59%, and 21.56% for MAPE,
RMSE, and CRME respectively. It is followed by the linear trend (row one) with 5.565, 24.4352 MW, and
15.5608 MW, representing 17.71%, 17.05%, and 16.80%; the logarithmic trend (row five) with 5.4929,
24.3490 MW, and 15.5604 MW, representing 17.49%, 16.99%, and 16.74%; and the moving average (row two) with
4.500, 21.4825 MW, and 14.4406 MW, representing 14.33%, 14.99%, and 15.59%. The quadratic trend (row four)
has 4.5402, 21.0660 MW, and 13.5715 MW, which represent 14.46%, 14.70%, and 14.66%. The combination of the
quadratic and logarithmic (hybrid) trends (row six) produced 4.5240, 21.0631 MW, and 13.5604 MW,
representing 14.41%, 14.68%, and 14.65% respectively. With the lowest RMSE and CRME values, and a MAPE only
marginally above that of the three-week moving average, this method is selected as the best among the
tested methods.
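The percentages quoted above appear to be each method's share of the column total in Table 1. Under that interpretation, which is an assumption, they can be recomputed from the table values as sketched below; small differences in the last digit are to be expected from rounding.

```python
# Table 1 values: (MAPE %, RMSE MW, CRME MW) for each method.
table1 = {
    "Linear trend": (5.565, 24.4352, 15.5608),
    "Three-week moving average": (4.500, 21.4825, 14.4406),
    "Exponential smoothing": (6.7813, 30.9439, 19.9698),
    "Quadratic trend": (4.5402, 21.0660, 13.5715),
    "Logarithmic trend": (5.4929, 24.3490, 15.5604),
    "Quadratic and logarithmic trend": (4.5240, 21.0631, 13.5604),
}


def percentage_shares(column):
    """Return each method's share (in percent) of the total for one metric column."""
    total = sum(values[column] for values in table1.values())
    return {name: 100.0 * values[column] / total for name, values in table1.items()}


mape_shares = percentage_shares(0)
rmse_shares = percentage_shares(1)
best_by_rmse = min(table1, key=lambda name: table1[name][1])
for name, share in mape_shares.items():
    print(f"{name}: {share:.2f}% of total MAPE")
print("Lowest RMSE:", best_by_rmse)
```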

The result therefore indicates that the hybrid method exploits the individual advantages of the
quadratic and logarithmic models to obtain better performance, as shown in Table 1. In general, hybrid
methods can be said to perform better than single methods used separately in forecasting electricity
demand. In addition, the forecasted values from each method were compared with the actual values to
ascertain the pattern and behavior of the models on the same dataset. The comparison plots are shown in
Figures 3 and 4.

Figures 3 and 4 show the comparison plots of the actual average weekly demand against the forecasted
values obtained from each method and from the hybrid (quadratic-logarithmic trend) method, respectively.
The average peak load is represented on the vertical axis and the weeks on the horizontal axis; the method
legend is shown within each plot. The plots are concentrated around the mean value of 2,122 MW, with the
actual demand curve showing variability as the weeks increase. The forecasted values show a smooth and
linear growth trend. Each of the methods produced a good forecast, but the best is the combined quadratic
and logarithmic method, as shown by the performance values in Table 1.

Figure 3. Comparison plot of the trend models


5. CONCLUSION


A forecasting model must be reliable in order to maintain an acceptable level of accuracy with
respect to the input variables. This study examined weekly electricity demand forecasting methods using a
trendline approach. The methods include linear trendline, moving average, exponential smoothing, quadratic
trendline, and logarithmic trendline. In addition, the developed quadratic and logarithmic models were
combined to produce a quadratic-logarithmic (hybrid) trendline method. The analysis was carried out using
Microsoft Excel. The results from each method were compared using established performance evaluation
metrics, namely MAPE and RMSE, and the cubic-root mean error (CRME) was introduced as a novel performance
evaluation metric. The hybrid (quadratic-logarithmic) trendline method was found to outperform the
individual trendline methods. It produced MAPE, RMSE, and CRME values representing 14.41%, 14.68%, and
14.65% respectively, with the RMSE and CRME being the lowest among the tested methods. These results
indicate that hybrid models perform better than individual models operating separately.




This study shows that any of the forecasting methods can produce a good forecast, but performance
improves when two or more models are combined. Moreover, the study points out that higher powers of the
forecast error (forecast value minus actual value), such as (Error)^2 and (Error)^3, can be used in
performance evaluation metrics; as the exponent increases, the mean error value decreases.

The study also shows that each electricity utility company needs its own forecasting technique.
What is applicable in one region or city may not be applicable in another, because the factors that affect
electricity demand differ from place to place. The method obtained in this study is an improvement on the
simple line techniques currently applied by EEDC. If applied, it will enhance the company's power
distribution and network planning, thereby averting frequent power failures and fluctuations.